Text File | 1988-03-16 | 13KB | 259 lines
User's guide to single-precision floating-point library FLOAT.LIB.
Assembly Language Version.
Copyright Bob Kline 1988
General notes:
One of the frustrations which assembly-language pro-
grammers occasionally encounter is the lack of in-
structions for dealing with floating-point operations.
Even in architectures where a math coprocessor is
designed to handle floating-point work efficiently,
the problem is often still not resolved, as it fre-
quently turns out that most machines are not equipped
with the extra math chip. A good example is the IBM
PC family and clones, which can use a math coprocessor
to do floating-point work in hardware. Only a very
small percentage of these computers are equipped with
such a coprocessor.
This library consists of procedures for performing
basic single-precision floating-point operations in
assembly language programs written for the 808x family
of computers not equipped with a math coprocessor.
The procedures have been built for use with the 'small'
memory model, that is, for programs which use 16-bit
pointers and near procedure calls. It should not be
an inordinately difficult task to rework the code for
other models, as (with the exception of the procedures
which take in or produce strings: ATOF, FTOA, and FTOE),
all parameters and results are passed in hardware
registers, and the new simplified .MODEL directives (a
feature of MASM 5.0) have been used. Although at this
point only the most fundamental operations have been
provided, it is hoped that a subsequent student or
team of students will build on this foundation, adding
some of the geometric and exponential functions.
All floating-point values handled by the library are
4-byte single-precision reals in the format established
by the IEEE standard. The details of this format are as
follows: the sign is stored in the most significant bit
(bit 31) of the 32-bit representation (set to 1 for a
negative number, reset to 0 for a positive value); the
exponent is stored as an eight-bit value in bits 23
through 30, and is biased by adding 127 to its actual
value -- that is, a (binary) exponent of 0 is
represented as 127, -1 as 126, 1 as 128, and so on; the
fractional portion of the mantissa is represented in
bits 0 through 22; except for a value of 0.0, which is
represented with all 32 bits of the float reset to
zeros, the integer bit of the mantissa is understood to
be 1, and does not therefore need to be explicitly
stored as part of the number. In terms of decimal
representation, this format can handle magnitudes from
roughly 10^-38 through 10^+38, with a precision of
about 7 significant digits.
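The layout just described can be checked from a high-level
language. The following is a minimal sketch in C, not part of
the library: the names float_fields and float_unpack are
hypothetical, and the code assumes the compiler's float type
is the same 4-byte IEEE single.

```c
#include <stdint.h>
#include <string.h>

/* Fields of a 4-byte IEEE single, as laid out in the text above. */
struct float_fields {
    unsigned sign;      /* bit 31: 1 = negative, 0 = positive        */
    int      exponent;  /* bits 23-30 with the bias of 127 removed   */
    uint32_t fraction;  /* bits 0-22, the stored part of the mantissa */
};

/* Hypothetical helper: split a float into its three fields.
   For 0.0 (all bits zero) the exponent comes back as -127, which
   is the special case mentioned in the text, not a real exponent. */
static struct float_fields float_unpack(float value)
{
    uint32_t bits;
    struct float_fields f;

    memcpy(&bits, &value, sizeof bits);   /* copy out the raw 32 bits */
    f.sign     = (unsigned)(bits >> 31);
    f.exponent = (int)((bits >> 23) & 0xFFu) - 127;
    f.fraction = bits & 0x007FFFFFu;
    return f;
}
```

For example, 1.0 unpacks to sign 0, exponent 0, and fraction 0,
since its biased exponent is stored as 127 and the integer bit
of the mantissa is implied.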
Names for the procedures have been chosen which would
not conflict with those used by the math coprocessor.
So, for example, the routine to add two floating-point
numbers is called F_ADD rather than FADD.
The following descriptions give details about how each
of the procedures works, including which registers are
used for incoming and outgoing values, assumptions made
by the procedures, and behavior with error conditions.
The list is headed by a description of _errno, a global-
ly available variable used as a flag by several of the
procedures to indicate that an error has occurred.
A companion file is provided, FLOAT.INC, which can be
included in the source code for modules using this li-
brary, giving the assembler access to the table of ex-
ternal names which it will need for finding the library
routines.
*---------------------------------------------------------------*
EXTRN _errno:WORD
Used by several of the math routines to signal that an
error has taken place. Two possible values are EDOM
(33), which indicates that an invalid parameter has been
passed, such as a divisor of zero; and ERANGE (34),
which indicates that the result does not fall within the
range of values which can be represented by the type
specified for the result. The calling routine should
reset _errno to zero before the math operation is invoked
to ensure that a subsequent check is not looking at the
results of an error from an earlier operation.
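The reset-then-check discipline can be sketched in C using the
standard errno, which follows the same convention. This is an
illustration only: model_divide and checked_divide are
hypothetical names, and model_divide merely stands in for a
routine that flags a zero divisor with EDOM.

```c
#include <errno.h>

/* Hypothetical stand-in for a math routine: reports a zero
   divisor through errno, as the text describes for _errno. */
static float model_divide(float dividend, float divisor)
{
    if (divisor == 0.0f) {
        errno = EDOM;        /* invalid parameter */
        return 0.0f;         /* result is not meaningful on error */
    }
    return dividend / divisor;
}

/* Returns 1 and stores the quotient on success, 0 if the
   routine flagged an error. */
static int checked_divide(float dividend, float divisor, float *quotient)
{
    errno = 0;                        /* clear any stale error first */
    *quotient = model_divide(dividend, divisor);
    return errno == 0;                /* nonzero means EDOM or ERANGE */
}
```

Without the reset at the top of checked_divide, an error left
over from an earlier call would make a successful operation
look as though it had failed.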
*---------------------------------------------------------------*
ATOF
Converts the string pointed to by SI to a floating-
point value. Leading spaces are skipped. The string
contains an optional sign, followed by a series of
decimal digits, possibly with a decimal point at any
point in the series of digits, followed optionally
by 'e' or 'E' and a (possibly signed) integer
indicating the exponent. The calling function is
responsible for making sure that the input string
actually contains a valid floating-point value. On
return DX:AX contain the 4-byte real result; in
addition, the values in registers BP, SI, DI, BX,
and CX are destroyed.
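Since validation is left to the caller, a pre-check along the
following lines may be useful. It is a sketch in C of the
string format described above, not code from the library. The
name looks_like_float is hypothetical, and three details are
assumptions the text does not settle: that '+' is accepted as
a sign, that at least one digit is required, and that nothing
may follow the number.

```c
#include <ctype.h>

/* Hypothetical pre-check: returns 1 if the string matches the
   format ATOF is described as accepting, 0 otherwise. */
static int looks_like_float(const char *s)
{
    int digits = 0;

    while (*s == ' ')                     /* leading spaces are skipped */
        s++;
    if (*s == '+' || *s == '-')           /* optional sign */
        s++;
    while (isdigit((unsigned char)*s)) {  /* digits before the point */
        s++;
        digits++;
    }
    if (*s == '.') {                      /* at most one decimal point */
        s++;
        while (isdigit((unsigned char)*s)) {
            s++;
            digits++;
        }
    }
    if (digits == 0)
        return 0;                         /* assumed: need one digit */
    if (*s == 'e' || *s == 'E') {         /* optional exponent */
        s++;
        if (*s == '+' || *s == '-')
            s++;
        if (!isdigit((unsigned char)*s))
            return 0;                     /* exponent needs digits */
        while (isdigit((unsigned char)*s))
            s++;
    }
    return *s == '\0';                    /* assumed: no trailing text */
}
```

Under these assumptions "  -12.5e3", "3." and ".5" are
accepted, while "1e", "1.2.3" and an empty string are rejected.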
*---------------------------------------------------------------*
ITOF
Converts the 2-byte integer value passed in the AX
register to a 4-byte real and stores the result in
DX:AX. In addition, the value in CX is changed.
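One way such a conversion can be carried out is sketched below
in C. This is a model for illustration, not the library's
code: the name int16_to_float_bits is hypothetical, and the
shift-and-count approach is an assumption about how a routine
of this kind might work.

```c
#include <stdint.h>

/* Hypothetical C model of a 16-bit integer to IEEE single
   conversion. Returns the 32-bit pattern of the result. */
static uint32_t int16_to_float_bits(int16_t value)
{
    uint32_t sign = 0;
    uint32_t magnitude;
    uint32_t exponent;

    if (value == 0)
        return 0;                         /* 0.0 is all 32 bits zero */
    if (value < 0) {
        sign = 0x80000000u;               /* sign goes in bit 31 */
        magnitude = (uint32_t)(-(int32_t)value);  /* safe for -32768 */
    } else {
        magnitude = (uint32_t)value;
    }

    /* Shift left until the leading 1 reaches bit 23, the position
       of the implied integer bit, lowering the exponent each time. */
    exponent = 127 + 23;
    while ((magnitude & 0x00800000u) == 0) {
        magnitude <<= 1;
        exponent--;
    }

    /* Drop the implied bit and assemble sign, exponent, fraction. */
    return sign | (exponent << 23) | (magnitude & 0x007FFFFFu);
}
```

Every 16-bit integer fits in the 24-bit mantissa, so the
conversion is exact and no rounding step is needed.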
*---------------------------------------------------------------*
F_ADD
Adds the 4-byte real passed in CX:BX to the 4-byte real
passed in DX:AX and stores the result in DX:AX. If the
result will not fit in a 4-byte real, _errno is set to
ERANGE, and the resulting value in DX:AX is
unpredictable. In addition to DX:AX, registers BX, CX,
SI, DI, and BP are changed.
*---------------------------------------------------------------*
F_SUB
Subtracts the 4-byte real in CX:BX from the 4-byte real
passed in DX:AX, storing the result in DX:AX. If the
result will not fit in a 4-byte real, _errno is set to
ERANGE, and the value in DX:AX is unpredictable. In
addition to the DX:AX registers used for the return
value, the CX, BX, DI, SI, and BP registers will be
altered.
*---------------------------------------------------------------*
FABSVAL
Takes the 4-byte real value contained in DX:AX and makes
it positive; no other registers are affected.
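Because the sign occupies bit 31 alone, making a value positive
amounts to clearing that one bit. The C sketch below models the
register pair to show where the bit falls. It assumes the usual
convention that DX holds the high word of DX:AX. The names
reg_pair and abs_in_registers are hypothetical, and the sketch
does not claim to reproduce the library's actual instructions.

```c
#include <stdint.h>

/* Model of the DX:AX pair: DX is assumed to be the high word. */
struct reg_pair {
    uint16_t dx;
    uint16_t ax;
};

/* Clear the sign bit of the 4-byte real held in the pair. */
static struct reg_pair abs_in_registers(struct reg_pair value)
{
    value.dx &= 0x7FFFu;   /* bit 15 of DX is bit 31 of the real */
    return value;          /* AX is untouched */
}
```

For -1.0, held as DX = BF80h and AX = 0000h, the result is
DX = 3F80h and AX = 0000h, which is 1.0.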
*---------------------------------------------------------------*
FBINTODEC
Takes a 4-byte real value and breaks down the compo-
nents into their decimal equivalents. The real value
is passed to the procedure in the DX:AX registers. On
return the DX:AX registers contain the mantissa, with
a decimal point understood to be at the end, the CX
register contains the sign in its low bit, and the BX
register contains the signed, unbiased decimal expo-
nent. In addition, the values in the SI and DI re-
gisters are destroyed.
*---------------------------------------------------------------*
FCMP
Compares 4-byte real value in DX:AX with 4-byte real
value in CX:BX to determine which is the larger. On
return AX contains a positive integer if the value in
DX:AX is greater than that in CX:BX, a negative integer
if DX:AX is less than CX:BX, and zero if the two values
are equal. In addition to the return value in AX,
registers DX, CX, BX, SI, DI, and BP are changed.
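A comparison of this kind can be made directly on the bit
patterns, since the format is sign-and-magnitude. The C sketch
below shows one way to do it. It is an assumption about the
technique, not the library's code. The name compare_float_bits
is hypothetical, a pattern with only the sign bit set is
treated as equal to 0.0, and special values such as NaN are
not considered.

```c
#include <stdint.h>

/* Hypothetical model: compare two IEEE singles given as bit
   patterns. Returns a positive value if a > b, a negative
   value if a < b, and zero if they are equal. */
static int compare_float_bits(uint32_t a, uint32_t b)
{
    uint32_t mag_a = a & 0x7FFFFFFFu;     /* magnitudes without sign */
    uint32_t mag_b = b & 0x7FFFFFFFu;
    int neg_a = (a >> 31) != 0;
    int neg_b = (b >> 31) != 0;

    if (mag_a == 0 && mag_b == 0)
        return 0;                         /* zero equals zero */
    if (neg_a != neg_b)
        return neg_a ? -1 : 1;            /* the negative one is smaller */
    if (mag_a == mag_b)
        return 0;
    /* Same sign: the larger magnitude is the larger value when
       both are positive and the smaller value when both are negative. */
    if (mag_a > mag_b)
        return neg_a ? -1 : 1;
    return neg_a ? 1 : -1;
}
```

The magnitudes can be compared as plain unsigned integers
because the biased exponent sits above the fraction bits, so a
larger exponent always gives a larger 31-bit value.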
*---------------------------------------------------------------*
FDECTOBIN
Converts the decimal components of a floating-point
number to a 4-byte real in IEEE format. On entry, DX:AX
contain the mantissa with the decimal point understood
to be at the end, CX contains the sign in its low bit,
and